Accuracy bounds for ensembles under 0-1 loss

Author

  • Remco R. Bouckaert
Abstract

This paper attempts to improve the quantitative understanding of how ensembles behave for discrete variables. A set of tight upper and lower bounds on the accuracy of an ensemble is presented for wide classes of ensemble algorithms, including bagging and boosting. The ensemble accuracy is expressed in terms of the accuracies of the members of the ensemble. Since those bounds represent best-case and worst-case behavior only, we also study typical behavior and discuss its properties. A parameterized bound is presented which describes ensemble behavior as a mixture of dependent base classifier and independent base classifier areas. Some empirical results are presented to support our conclusions.
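
To make the contrast between dependent and independent base classifiers concrete, the sketch below computes majority-vote accuracy in two textbook extremes. It does not reproduce the paper's bounds or its parameterized mixture. It assumes a binary problem, an odd number `n` of base classifiers that all have the same accuracy `p`, and simple majority voting. The function names are hypothetical and chosen for illustration only.

```python
from math import comb


def independent_majority_accuracy(p: float, n: int) -> float:
    """Majority-vote accuracy for n independent binary classifiers.

    Assumption (not from the paper): every classifier is correct with the
    same probability p, errors are independent, and n is odd so no ties occur.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    needed = n // 2 + 1  # smallest number of correct votes that wins
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(needed, n + 1))


def fully_dependent_majority_accuracy(p: float, n: int) -> float:
    """Majority-vote accuracy when all n classifiers make identical predictions.

    In this extreme the ensemble behaves like a single member, so the
    accuracy is p regardless of n.
    """
    if n < 1:
        raise ValueError("n must be positive")
    return p


if __name__ == "__main__":
    for n in (1, 3, 11, 21):
        ind = independent_majority_accuracy(0.6, n)
        dep = fully_dependent_majority_accuracy(0.6, n)
        print(f"n={n:2d}  independent={ind:.4f}  fully dependent={dep:.4f}")
```

Under these assumptions, independent members with accuracy above 0.5 gain from voting as `n` grows, while identical members gain nothing. Members with accuracy below 0.5 make the independent ensemble worse than any single member.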

Similar resources

On Achievable Rates and Complexity of LDPC Codes for Parallel Channels with Application to Puncturing

This paper considers the achievable rates and decoding complexity of low-density parity-check (LDPC) codes over statistically independent parallel channels. The paper starts with the derivation of bounds on the conditional entropy of the transmitted codeword given the received sequence at the output of the parallel channels; the component channels are considered to be memoryless, binary-input, ...

On Universal Properties of Capacity-Approaching LDPC Ensembles

This paper provides some universal information-theoretic bounds related to capacity-approaching ensembles of low-density parity-check (LDPC) codes. These bounds refer to the behavior of the degree distributions of such ensembles, and also to the graphical complexity and the fundamental system of cycles associated with the Tanner graphs of LDPC ensembles. The transmission of these ensembles is a...

On Universal Properties of the Degree Distributions and Cycles of Capacity-Approaching LDPC Ensembles∗

This paper provides some universal information-theoretic bounds related to the degree distributions and the average cardinality of the fundamental system of cycles of low-density parity-check (LDPC) ensembles. The transmission of these ensembles is assumed to take place over an arbitrary memoryless binary-input output-symmetric (MBIOS) channel, and the bounds are expressed in terms of the gap b...

Bounds for Validation

In this paper we derive the bounds for Validation (known also as Hold-Out Estimate and Train-and-Test Method). We present the best possible bound in the case of 0-1 valued loss function. We also provide the tables where the least sample size is calculated that is necessary for obtaining the bound for a given estimation rate and reliability of estimation. For an arbitrary bounded loss function w...

A High-Performance Model based on Ensembles for Twitter Sentiment Classification

Background and Objectives: Twitter sentiment classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world use social networks like Twitter intensively. Twitter lets users publish tweets to say what they think about topics. There are numerous web sites built on the Internet presenting Twitter. The user can enter a sentiment ta...

Publication date: 2002